Abstract: The first-order moving average model, or MA(1), is given by
$X_t = Z_t - \theta_0 Z_{t-1}$, with independent and identically distributed
$\{Z_t\}$. This is arguably the simplest time series model that one can write
down. The MA(1) with unit root ($\theta_0 = 1$) arises naturally in a variety
of time series applications. For example, if an underlying time series consists
of a linear trend plus white noise errors, then the differenced series is an
MA(1) with unit root. In such cases, testing for a unit root of the differenced
series is equivalent to testing the adequacy of the trend plus noise model. The
unit root problem also arises naturally in a signal plus noise model in which
the signal is modeled as a random walk. The differenced series follows an MA(1)
model and has a unit root if and only if the random walk signal is in fact a
constant.

The asymptotic theory of various estimators based on Gaussian likelihood has
been developed for the unit root case and nearly unit root case
($\theta = 1 + \beta/n$, $\beta \le 0$). Unlike standard
$1/\sqrt{n}$-asymptotics, these estimation procedures have $1/n$-asymptotics
and a so-called pile-up effect, in which $P(\hat\theta = 1)$ converges to a
positive value. One explanation for this pile-up phenomenon is the lack of
identifiability of $\theta$ in the Gaussian case. That is, the Gaussian
likelihood has the same value for the two sets of parameter values
$(\theta, \sigma^2)$ and $(1/\theta, \theta^2\sigma^2)$. It follows that
$\theta = 1$ is always a critical point of the likelihood function. In
contrast, for non-Gaussian noise, $\theta$ is identifiable for all real values.
Hence it is no longer clear whether or not the same pile-up phenomenon will
persist in the non-Gaussian case. In this paper, we focus on limiting pile-up
probabilities for estimates of $\theta_0$ based on a Laplace likelihood. In
some cases, these estimates can be viewed as Least Absolute Deviation (LAD)
estimates. Simulation results illustrate the limit theory.



1. Introduction 



The moving average model of order one (MA(1)) given by

(1.1) $$X_t = Z_t - \theta_0 Z_{t-1},$$

where $\{Z_t\}$ is a sequence of independent and identically distributed random
variables with mean 0 and variance $\sigma^2$, is one of the simplest models in
time series. The MA(1) model is invertible if and only if $|\theta_0| < 1$,
since in this case $Z_t$ can be represented explicitly in terms of past values
of the $X_t$, i.e.,

$$Z_t = \sum_{j=0}^{\infty} \theta_0^{\,j} X_{t-j}.$$

Under this invertibility constraint, standard estimation procedures that
produce asymptotically normal estimates are readily available. For example, if
$\hat\theta$ represents the maximum likelihood estimator, found by maximizing
the Gaussian likelihood based on the data $X_1, \ldots, X_n$, then it is well
known (see Brockwell and Davis [3]) that

(1.2) $$\sqrt{n}\,(\hat\theta - \theta_0) \xrightarrow{d} N(0,\, 1 - \theta_0^2).$$

From the form of the limiting variance in (1.2), the asymptotic behavior of
$\hat\theta$, let alone the scaling, is not immediately clear in the unit root
case corresponding to $\theta_0 = 1$.

In the Gaussian case, the parameters $\theta_0$ and $\sigma^2$ are not
identifiable without the constraint $|\theta_0| \le 1$. In particular, the
profile Gaussian log-likelihood, obtained by concentrating out the variance
parameter, satisfies

$$L(\theta) = L(1/\theta).$$

It follows that $\theta = 1$ is a critical value of the profile likelihood and
hence there is a positive probability that $\hat\theta = 1$ is indeed the
maximum likelihood estimator. If $\theta_0 = 1$, then it turns out that this
probability does not vanish asymptotically (see, for example, Anderson and
Takemura [1], Tanaka, and Davis and Dunsmuir [6]). This phenomenon is referred
to as the pile-up effect. For the case that $\theta_0 = 1$ or is near one in
the sense that $\theta_0 = 1 + \gamma/n$, it was shown in Davis and Dunsmuir
[6] that

$$n(\hat\theta - \theta_0) \xrightarrow{d} \xi_\gamma,$$

where $\xi_\gamma$ is a random variable with a discrete component at 0,
corresponding to the asymptotic pile-up probability, and a continuous component
on $(-\infty, 0)$.
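
The symmetry $L(\theta) = L(1/\theta)$ of the Gaussian profile likelihood can be
checked numerically. The following is a minimal sketch, not code from the
paper: the function name `gaussian_profile_loglik` is hypothetical, and the
profile likelihood is evaluated by brute force from the tridiagonal MA(1)
covariance matrix, up to an additive constant.

```python
import numpy as np

def gaussian_profile_loglik(x, theta):
    """Gaussian profile log-likelihood of X_t = Z_t - theta*Z_{t-1},
    with the variance concentrated out (additive constants dropped)."""
    n = len(x)
    # Covariance of (X_1..X_n) divided by sigma^2: tridiagonal Toeplitz matrix.
    R = (1.0 + theta**2) * np.eye(n) - theta * (np.eye(n, k=1) + np.eye(n, k=-1))
    _, logdet = np.linalg.slogdet(R)
    quad = x @ np.linalg.solve(R, x)          # x' R^{-1} x
    return -0.5 * n * np.log(quad / n) - 0.5 * logdet

rng = np.random.default_rng(0)
z = rng.standard_normal(51)
x = z[1:] - z[:-1]                            # unit-root MA(1), theta_0 = 1
for theta in (0.5, 0.8, 0.95):
    # The two printed values agree up to rounding error.
    print(theta, gaussian_profile_loglik(x, theta),
          gaussian_profile_loglik(x, 1.0 / theta))
```

Because the function is symmetric under $\theta \mapsto 1/\theta$, its derivative
in $\log\theta$ must vanish at $\theta = 1$, which is the critical-point property
used in the text.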

The MA(1) with unit root ($\theta_0 = 1$) arises naturally in a variety of time
series applications. For example, if an underlying time series consists of a
linear trend plus white noise errors, then the differenced series is an MA(1)
with a unit root. In such cases, testing for a unit root of the differenced
series is equivalent to testing the adequacy of the trend plus noise model. The
unit root problem also arises naturally in a signal plus noise model in which
the signal is modeled as a random walk. The differenced series follows an MA(1)
model and has a unit root if and only if the random walk signal is in fact a
constant.
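
The trend plus noise example can be illustrated with a short simulation. This
is a sketch under assumed values: the intercept `a`, slope `b`, the sample size,
and the helper `lag1_autocorr` are illustrative choices, not quantities from
the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
a, b = 2.0, 0.5                      # hypothetical intercept and slope
t = np.arange(n + 1)
z = rng.laplace(size=n + 1)          # white noise errors Z_0, ..., Z_n
y = a + b * t + z                    # linear trend plus noise
x = np.diff(y)                       # differenced series: b + Z_t - Z_{t-1}

def lag1_autocorr(v):
    """Sample lag-one autocorrelation of a series."""
    v = v - v.mean()
    return float(np.dot(v[1:], v[:-1]) / np.dot(v, v))

# After removing the constant b, the differenced series is exactly an
# MA(1) with theta_0 = 1, whose lag-one autocorrelation is -1/2.
print(np.allclose(x - b, z[1:] - z[:-1]))
print(lag1_autocorr(x))
```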

For Gaussian likelihood estimation, the pile-up effect is directly attributable
to the non-identifiability of $\theta_0$ in the unconstrained parameter space.
On the other hand, if the data are non-Gaussian, then $\theta_0$ is
identifiable (see Breidt and Davis). In this paper, we focus on the pile-up
probability for estimates based on a Laplace likelihood. Assuming a Laplace
distribution for the noise, we derive an expression for the joint likelihood of
$\theta$ and $z_{\mathrm{init}}$, where $z_{\mathrm{init}}$ is an augmented
variable that is treated as a parameter and the scale parameter $\sigma$ is
concentrated out of the likelihood. If $z_{\mathrm{init}}$ is set equal to 0,
then the resulting joint likelihood corresponds to the least absolute deviation
(LAD) objective function and the estimator of $\theta$ is referred to as the
LAD estimator of $\theta_0$. The exact likelihood can be obtained by
integrating out $z_{\mathrm{init}}$. In this case the resulting estimator is
referred to as the quasi-maximum likelihood estimator of $\theta_0$. It turns
out that the estimator based on maximizing the joint likelihood always has a
positive pile-up probability in the limit regardless of the true noise
distribution. In contrast, the quasi-maximum likelihood estimator has a
limiting pile-up probability of zero.

In Section 2, we describe the main asymptotic results. We begin by deriving an
expression for computing the joint likelihood function based on the observed
data and the augmented variable $z_{\mathrm{init}}$, in terms of the density
function of the noise. The exact likelihood function can then be computed by
integrating out $z_{\mathrm{init}}$. After a reparameterization, we derive the
limiting behavior of the joint likelihood for the case when the noise is
assumed to follow a Laplace distribution. In Section 3, we focus on the problem
of calculating asymptotic pile-up probabilities for estimators which maximize
the joint Laplace likelihood (as a function of $\theta$ and
$z_{\mathrm{init}}$) and the exact Laplace likelihood. Section 4 contains
simulation results which illustrate the asymptotic theory of Section 3.

2. Main result 

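Let $\{X_t\}$ be the MA(1) model given in (1.1), where $\theta_0 \in
\mathbb{R}$ and $\{Z_t\}$ is a sequence of iid random variables with $EZ_t = 0$
and density function $f_Z$. In order to compute the likelihood based on the
observed data $\mathbf{X}_n = (X_1, \ldots, X_n)'$, it is convenient to define
an augmented initial variable $Z_{\mathrm{init}}$ by

$$Z_{\mathrm{init}} = \begin{cases} Z_0, & \text{if } |\theta| \le 1, \\ Z_n - \sum_{t=1}^{n} X_t, & \text{otherwise.} \end{cases}$$

A straightforward calculation shows that the joint density of the observed data
$\mathbf{X}_n = (X_1, X_2, \ldots, X_n)'$ and the initial variable
$Z_{\mathrm{init}}$ satisfies

$$f(\mathbf{x}_n, z_{\mathrm{init}}) = \prod_{t=0}^{n} f_Z(z_t)\,\bigl(1_{\{|\theta| \le 1\}} + |\theta|^{-n}\, 1_{\{|\theta| > 1\}}\bigr),$$

where the residuals $\{z_t\}$ are functions of $\mathbf{X}_n = \mathbf{x}_n$,
$\theta$, and $Z_{\mathrm{init}} = z_{\mathrm{init}}$, which can be solved
forward by $z_t = X_t + \theta z_{t-1}$ for $t = 1, 2, \ldots, n$ with the
initial value $z_0 = z_{\mathrm{init}}$ if $|\theta| \le 1$, and backward by
$z_{t-1} = \theta^{-1}(z_t - X_t)$ for $t = n, n-1, \ldots, 1$ with the initial
value $z_n = z_{\mathrm{init}} + \sum_{t=1}^{n} X_t$ if $|\theta| > 1$.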

The Laplace log-likelihood is obtained by taking the density function for $Z_t$
to be $f_Z(z) = \exp\{-|z|/\sigma\}/(2\sigma)$. If we view $z_{\mathrm{init}}$
as a parameter, then the joint log-likelihood is given by

(2.1) $$-(n+1)\log 2\sigma - \frac{1}{\sigma}\sum_{t=0}^{n} |z_t| - n(\log|\theta|)\,1_{\{|\theta| > 1\}}.$$

Maximizing this function with respect to the scale parameter $\sigma$, we
obtain

$$\hat\sigma = \sum_{t=0}^{n} |z_t| \,/\, (n+1).$$

It follows that maximizing the joint Laplace log-likelihood is equivalent to
minimizing the following objective function,

(2.2) $$\ell_n(\theta, z_{\mathrm{init}}) = \begin{cases} \sum_{t=0}^{n} |z_t|, & \text{if } |\theta| \le 1, \\ \sum_{t=0}^{n} |z_t|\,|\theta|, & \text{otherwise.} \end{cases}$$
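
The residual recursions and the objective function (2.2) translate directly
into code. The sketch below is illustrative only: the function names
`residuals`, `joint_objective`, and `sigma_hat` are hypothetical, and no
optimizer is included.

```python
import numpy as np

def residuals(x, theta, z_init):
    """Residuals z_0..z_n: forward recursion if |theta| <= 1,
    backward recursion if |theta| > 1 (x holds X_1..X_n)."""
    n = len(x)
    z = np.empty(n + 1)
    if abs(theta) <= 1:
        z[0] = z_init
        for t in range(1, n + 1):
            z[t] = x[t - 1] + theta * z[t - 1]       # z_t = X_t + theta*z_{t-1}
    else:
        z[n] = z_init + x.sum()                      # z_n = z_init + sum of X_t
        for t in range(n, 0, -1):
            z[t - 1] = (z[t] - x[t - 1]) / theta     # z_{t-1} = (z_t - X_t)/theta
    return z

def joint_objective(x, theta, z_init):
    """Objective (2.2): sum |z_t|, multiplied by |theta| when |theta| > 1."""
    s = np.abs(residuals(x, theta, z_init)).sum()
    return s if abs(theta) <= 1 else s * abs(theta)

def sigma_hat(x, theta, z_init):
    """Concentrated scale estimate: sum |z_t| / (n + 1)."""
    return np.abs(residuals(x, theta, z_init)).sum() / (len(x) + 1)

rng = np.random.default_rng(7)
Z = rng.laplace(size=51)                 # true innovations Z_0..Z_50
X = Z[1:] - Z[:-1]                       # unit-root MA(1) data
# At the true values (theta, z_init) = (1, Z_0) the residuals are the innovations.
print(np.allclose(residuals(X, 1.0, Z[0]), Z))
print(joint_objective(X, 1.0, Z[0]), sigma_hat(X, 1.0, Z[0]))
```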



In order to study the asymptotic properties of the minimizer of $\ell_n$ when
the model has $\theta_0 = 1$, we follow Davis and Dunsmuir [6] by building the
sample size into the parameterization of $\theta$. Specifically, we use

(2.3) $$\theta = 1 + \frac{\beta}{n},$$

where $\beta$ is any real number. Additionally, since we are also treating
$z_{\mathrm{init}}$ as a parameter, this term is reparameterized as

(2.4) $$z_{\mathrm{init}} = Z_0 + \frac{\alpha\sigma}{\sqrt{n}}.$$

Under the $(\beta, \alpha)$ parameterization, minimizing $\ell_n$ with respect
to $\theta$ and $z_{\mathrm{init}}$ is equivalent to minimizing the function

$$U_n(\beta, \alpha) \equiv \frac{1}{\sigma}\bigl[\ell_n(\theta, z_{\mathrm{init}}) - \ell_n(1, Z_0)\bigr]$$

with respect to $\beta$ and $\alpha$. The following theorem describes the
limiting behavior of $U_n$.

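Theorem 2.1. For the model (1.1) with $\theta_0 = 1$, assume the noise sequence
$\{Z_t\}$ is IID with $EZ_t = 0$, $E[\operatorname{sign}(Z_t)] = 0$ (i.e., the
median of $Z_t$ is zero), $EZ_t^4 < \infty$, and common probability density
function $f_Z(z) = \sigma^{-1} f(z/\sigma)$, where $\sigma > 0$ is the scale
parameter. We further assume that the density function $f_Z$ has been
normalized so that $\sigma = E|Z_t|$. Then

(2.5) $$U_n(\beta, \alpha) \xrightarrow{fidi} U(\beta, \alpha),$$

where $\xrightarrow{fidi}$ denotes convergence in distribution of finite
dimensional distributions and

(2.6) $$U(\beta, \alpha) = \int_0^1 \Bigl[\beta \int_0^s e^{\beta(s-t)}\,dS(t) + \alpha e^{\beta s}\Bigr] dW(s) + f(0) \int_0^1 \Bigl[\beta \int_0^s e^{\beta(s-t)}\,dS(t) + \alpha e^{\beta s}\Bigr]^2 ds,$$

for $\beta \le 0$, and

(2.7) $$U(\beta, \alpha) = \int_0^1 \Bigl[-\beta \int_{s+}^1 e^{-\beta(t-s)}\,dS(t) + \alpha e^{-\beta(1-s)}\Bigr] dW(s) + f(0) \int_0^1 \Bigl[-\beta \int_s^1 e^{-\beta(t-s)}\,dS(t) + \alpha e^{-\beta(1-s)}\Bigr]^2 ds,$$

for $\beta > 0$, in which $S(t)$ and $W(t)$ are the limits of the following
partial sums

$$S_n(t) = \frac{1}{\sqrt{n}} \sum_{i=0}^{[nt]} Z_i/\sigma, \qquad W_n(t) = \frac{1}{\sqrt{n}} \sum_{i=0}^{[nt]} \operatorname{sign}(Z_i),$$

respectively.

Remark. The stochastic integrals in (2.6) and (2.7) refer to Itô integrals. The
double stochastic integral in the first term on the right side of (2.7) is
computed as

$$\int_0^1 \int_{s+}^1 e^{-\beta(t-s)}\,dS(t)\,dW(s) = \int_0^1 e^{-\beta t}\,dS(t) \int_0^1 e^{\beta s}\,dW(s) - \int_0^1 \int_0^s e^{-\beta(t-s)}\,dS(t)\,dW(s) - \int_0^1 dS(t)\,dW(t),$$

where (see (2.15) below)

$$\int_0^1 dS(t)\,dW(t) = E\bigl(Z_i \operatorname{sign}(Z_i)\bigr)/\sigma = E|Z_i|/\sigma = 1.$$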



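Proof. We only prove the result (2.5) for a fixed $(\beta, \alpha)$; the
extension to a finite collection of $(\beta, \alpha)$'s is relatively
straightforward. First consider the case $\beta < 0$. For calculating the
Laplace likelihood $\ell_n(\theta, z_{\mathrm{init}})$ based on model (1.1),
the residuals are solved by $z_t = X_t + \theta z_{t-1}$ for
$t = 1, 2, \ldots, n$ with the initial value $z_0 = z_{\mathrm{init}}$. Since
$X_t = Z_t - Z_{t-1}$, all of the true innovations can be solved forward by
$Z_t = X_t + Z_{t-1}$ for $t = 1, 2, \ldots, n$ with the initial value $Z_0$.
Therefore, the centered term $\ell_n(1, Z_0)$ can be written as

$$\ell_n(1, Z_0) = |Z_0| + \sum_{i=1}^{n} |X_i + \cdots + X_1 + Z_0| = \sum_{i=0}^{n} |Z_i|.$$

For $\beta < 0$, i.e., $\theta < 1$,

$$\begin{aligned}
z_i &= X_i + \theta X_{i-1} + \cdots + \theta^{i-1} X_1 + \theta^{i} z_{\mathrm{init}} \\
&= (Z_i - Z_{i-1}) + \theta (Z_{i-1} - Z_{i-2}) + \cdots + \theta^{i-1}(Z_1 - Z_0) + \theta^{i} z_{\mathrm{init}} \\
&= Z_i - (1-\theta) Z_{i-1} - \theta(1-\theta) Z_{i-2} - \cdots - \theta^{i-1}(1-\theta) Z_0 - \theta^{i}(Z_0 - z_{\mathrm{init}}),
\end{aligned}$$

which, under the true model $\theta_0 = 1$, implies

(2.8) $$\frac{1}{\sigma}\bigl[\ell_n(\theta, z_{\mathrm{init}}) - \ell_n(1, Z_0)\bigr] = \frac{1}{\sigma}\Bigl(\sum_{i=0}^{n} |z_i| - \sum_{i=0}^{n} |Z_i|\Bigr) = \frac{1}{\sigma} \sum_{i=0}^{n} \bigl(|Z_i - y_i| - |Z_i|\bigr),$$

where $y_0 = Z_0 - z_{\mathrm{init}}$ and

$$y_i = (1-\theta) \sum_{j=0}^{i-1} \theta^{\,i-1-j} Z_j + \theta^{i}(Z_0 - z_{\mathrm{init}})$$

for $i = 1, 2, \ldots, n$. Using the identity

(2.9) $$|Z - y| - |Z| = -y \operatorname{sign}(Z) + 2(y - Z)\bigl(1_{\{0<Z<y\}} - 1_{\{y<Z<0\}}\bigr)$$

for $Z \ne 0$, the equation (2.8) is expressed as two summations, the first of
which is

$$\begin{aligned}
-\sum_{i=0}^{n} \frac{y_i}{\sigma} \operatorname{sign}(Z_i)
&= (\theta - 1) \sum_{i=1}^{n} \sum_{j=0}^{i-1} \theta^{\,i-1-j}\, \frac{Z_j}{\sigma} \operatorname{sign}(Z_i) + \frac{z_{\mathrm{init}} - Z_0}{\sigma} \sum_{i=0}^{n} \theta^{i} \operatorname{sign}(Z_i) \\
&= \frac{\beta}{n} \sum_{i=1}^{n} \sum_{j=0}^{i-1} \Bigl(1 + \frac{\beta}{n}\Bigr)^{i-1-j} \frac{Z_j}{\sigma} \operatorname{sign}(Z_i) + \frac{\alpha}{\sqrt{n}} \sum_{i=0}^{n} \Bigl(1 + \frac{\beta}{n}\Bigr)^{i} \operatorname{sign}(Z_i) \\
&= \beta \int_0^1 \int_0^{s-} \Bigl(1 + \frac{\beta}{n}\Bigr)^{[ns]-[nt]-1} dS_n(t)\, dW_n(s) + \alpha \int_0^1 \Bigl(1 + \frac{\beta}{n}\Bigr)^{[ns]} dW_n(s)
\end{aligned}$$

(2.10) $$\xrightarrow{d} \beta \int_0^1 \int_0^s e^{\beta(s-t)}\,dS(t)\,dW(s) + \alpha \int_0^1 e^{\beta s}\,dW(s),$$

where the limit in (2.10) follows from a simple adaptation of Theorem 2.4 (ii)
in Chan and Wei [4].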
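
Identity (2.9), which drives the split of (2.8) into two summations, can be
verified numerically. The following sketch checks it on random draws; the
function names `abs_diff` and `identity_rhs` are hypothetical.

```python
import numpy as np

def abs_diff(Z, y):
    """Left side of (2.9): |Z - y| - |Z|."""
    return np.abs(Z - y) - np.abs(Z)

def identity_rhs(Z, y):
    """Right side of (2.9):
    -y*sign(Z) + 2*(y - Z)*(1{0<Z<y} - 1{y<Z<0})."""
    ind = ((0 < Z) & (Z < y)).astype(float) - ((y < Z) & (Z < 0)).astype(float)
    return -y * np.sign(Z) + 2.0 * (y - Z) * ind

rng = np.random.default_rng(3)
Z = rng.standard_normal(100_000)     # continuous draws, so Z != 0 almost surely
y = rng.standard_normal(100_000)
print(np.allclose(abs_diff(Z, y), identity_rhs(Z, y)))
```

The indicator term is nonzero only when $y$ lies on the same side of zero as
$Z$ and beyond it, which is why it contributes at second order once $y_i$ is
small.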

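To handle the second summation in computing $U_n(\beta, \alpha)$, we
approximate the sum

$$\sum_{i=0}^{n} 2\,\frac{y_i - Z_i}{\sigma}\bigl(1_{\{0<Z_i<y_i\}} - 1_{\{y_i<Z_i<0\}}\bigr)$$

by

$$\sum_{i=0}^{n} E\Bigl[\,2\,\frac{y_i - Z_i}{\sigma}\bigl(1_{\{0<Z_i<y_i\}} - 1_{\{y_i<Z_i<0\}}\bigr) \Bigm| \mathcal{F}_{i-1}\Bigr],$$

where $\mathcal{F}_i$ is the $\sigma$-field generated by
$\{Z_j : j = 0, 1, \ldots, i\}$. First we establish convergence of the latter
sum and then show that the variance of the difference in sums converges to
zero. Since

$$\max_{1 \le i \le n} |y_i| \xrightarrow{p} 0$$

and $y_i \in \mathcal{F}_{i-1}$, we have

$$2E\Bigl[\frac{y_i - Z_i}{\sigma}\, 1_{\{0<Z_i<y_i\}} \Bigm| \mathcal{F}_{i-1}\Bigr] = 2\int_0^{y_i} \frac{y_i - z}{\sigma}\, \frac{1}{\sigma} f\Bigl(\frac{z}{\sigma}\Bigr) dz \approx f(0) \Bigl(\frac{y_i}{\sigma}\Bigr)^2$$

for $y_i > 0$, and

$$2E\Bigl[\frac{y_i - Z_i}{\sigma}\, 1_{\{y_i<Z_i<0\}} \Bigm| \mathcal{F}_{i-1}\Bigr] = 2\int_{y_i}^{0} \frac{y_i - z}{\sigma}\, \frac{1}{\sigma} f\Bigl(\frac{z}{\sigma}\Bigr) dz \approx -f(0) \Bigl(\frac{y_i}{\sigma}\Bigr)^2$$

for $y_i < 0$. Combining these two cases, we have

$$\sum_{i=0}^{n} E\Bigl[\,2\,\frac{y_i - Z_i}{\sigma}\bigl(1_{\{0<Z_i<y_i\}} - 1_{\{y_i<Z_i<0\}}\bigr) \Bigm| \mathcal{F}_{i-1}\Bigr] \approx f(0) \sum_{i=0}^{n} \Bigl(\frac{y_i}{\sigma}\Bigr)^2,$$

where

$$\begin{aligned}
\sum_{i=0}^{n} \Bigl(\frac{y_i}{\sigma}\Bigr)^2
&= \Bigl(\frac{Z_0 - z_{\mathrm{init}}}{\sigma}\Bigr)^2 + \sum_{i=1}^{n} \Bigl[\frac{1-\theta}{\sigma} \sum_{j=0}^{i-1} \theta^{\,i-1-j} Z_j + \theta^{i}\, \frac{Z_0 - z_{\mathrm{init}}}{\sigma}\Bigr]^2 \\
&= \frac{\alpha^2}{n} + \frac{1}{n} \sum_{i=1}^{n} \Bigl[\frac{\beta}{\sqrt{n}} \sum_{j=0}^{i-1} \Bigl(1 + \frac{\beta}{n}\Bigr)^{i-1-j} \frac{Z_j}{\sigma} + \alpha \Bigl(1 + \frac{\beta}{n}\Bigr)^{i}\Bigr]^2
\end{aligned}$$

(2.11) $$\xrightarrow{d} \int_0^1 \Bigl[\beta \int_0^s e^{\beta(s-t)}\,dS(t) + \alpha e^{\beta s}\Bigr]^2 ds$$

in distribution as $n \to \infty$.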
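It is left to show that

(2.12) $$\sum_{i=0}^{n} 2\,\frac{y_i - Z_i}{\sigma}\bigl(1_{\{0<Z_i<y_i\}} - 1_{\{y_i<Z_i<0\}}\bigr) - \sum_{i=0}^{n} E\Bigl[\,2\,\frac{y_i - Z_i}{\sigma}\bigl(1_{\{0<Z_i<y_i\}} - 1_{\{y_i<Z_i<0\}}\bigr) \Bigm| \mathcal{F}_{i-1}\Bigr]$$

converges to zero in probability. Define

$$y_i^* = 2\,\frac{y_i - Z_i}{\sigma}\bigl(1_{\{0<Z_i<y_i\}} - 1_{\{y_i<Z_i<0\}}\bigr).$$

The expectation of (2.12) is zero and therefore it is enough to show that the
variance of (2.12) also converges to zero. The variance of (2.12) is equal to

$$\begin{aligned}
&\sum_{i=0}^{n} \operatorname{var}\bigl(y_i^* - E(y_i^* \mid \mathcal{F}_{i-1})\bigr) + 2 \sum_{i<j} \operatorname{cov}\bigl(y_i^* - E(y_i^* \mid \mathcal{F}_{i-1}),\; y_j^* - E(y_j^* \mid \mathcal{F}_{j-1})\bigr) \\
&\quad= \sum_{i=0}^{n} E\bigl[y_i^* - E(y_i^* \mid \mathcal{F}_{i-1})\bigr]^2 \\
&\quad= \sum_{i=0}^{n} E\,E\bigl[(y_i^*)^2 - \bigl(E(y_i^* \mid \mathcal{F}_{i-1})\bigr)^2 \bigm| \mathcal{F}_{i-1}\bigr]
\end{aligned}$$

(2.13) $$= \sum_{i=0}^{n} E\bigl[E\bigl((y_i^*)^2 \mid \mathcal{F}_{i-1}\bigr)\bigr] - \sum_{i=0}^{n} E\bigl[\bigl(E(y_i^* \mid \mathcal{F}_{i-1})\bigr)^2\bigr] \approx \tfrac{4}{3} f(0)\, E\Bigl[\sum_{i=0}^{n} \Bigl|\frac{y_i}{\sigma}\Bigr|^3\Bigr] - f(0)^2\, E\Bigl[\sum_{i=0}^{n} \Bigl(\frac{y_i}{\sigma}\Bigr)^4\Bigr] \to 0$$

as $n \to \infty$, where

$$\begin{aligned}
\operatorname{cov}\bigl(y_i^* - E(y_i^* \mid \mathcal{F}_{i-1}),\; y_j^* - E(y_j^* \mid \mathcal{F}_{j-1})\bigr)
&= E\bigl\{\bigl[y_i^* - E(y_i^* \mid \mathcal{F}_{i-1})\bigr]\bigl[y_j^* - E(y_j^* \mid \mathcal{F}_{j-1})\bigr]\bigr\} \\
&= E\,E\bigl\{\bigl[y_i^* - E(y_i^* \mid \mathcal{F}_{i-1})\bigr]\bigl[y_j^* - E(y_j^* \mid \mathcal{F}_{j-1})\bigr] \bigm| \mathcal{F}_{j-1}\bigr\} \\
&= E\bigl\{\bigl[y_i^* - E(y_i^* \mid \mathcal{F}_{i-1})\bigr]\, E\bigl[y_j^* - E(y_j^* \mid \mathcal{F}_{j-1}) \bigm| \mathcal{F}_{j-1}\bigr]\bigr\} \\
&= 0
\end{aligned}$$

for $i < j$. The two sums in (2.13) vanish because $\sqrt{n}\,y_i/\sigma$
behaves like the bracketed term in (2.11), so that

$$\sum_{i=0}^{n} \Bigl|\frac{y_i}{\sigma}\Bigr|^3 \approx \frac{1}{\sqrt{n}} \int_0^1 \Bigl|\beta \int_0^s e^{\beta(s-t)}\,dS(t) + \alpha e^{\beta s}\Bigr|^3 ds, \qquad \sum_{i=0}^{n} \Bigl(\frac{y_i}{\sigma}\Bigr)^4 \approx \frac{1}{n} \int_0^1 \Bigl(\beta \int_0^s e^{\beta(s-t)}\,dS(t) + \alpha e^{\beta s}\Bigr)^4 ds.$$

Based on (2.10), (2.11), and (2.13), the proof for $\beta < 0$ is complete.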

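The proof of (2.7) for $\beta > 0$ is similar to that for $\beta < 0$. For
$\beta > 0$, i.e., $\theta > 1$, the residuals $\{z_t\}$ are solved backward by
$z_{t-1} = \theta^{-1}(z_t - X_t)$ for $t = n, n-1, \ldots, 1$ with the initial
value $z_n = z_{\mathrm{init}} + \sum_{t=1}^{n} X_t$. Solving these equations,
we have

$$z_{n-1-i} = -\theta^{-1} X_{n-i} - \theta^{-2} X_{n-i+1} - \cdots - \theta^{-(i+1)} X_n + \theta^{-(i+1)} z_n$$

for $i = 0, 1, \ldots, n-1$. Writing $X_t = Z_t - Z_{t-1}$, we obtain

$$\begin{aligned}
-\theta z_{n-1-i} &= X_{n-i} + \theta^{-1} X_{n-i+1} + \cdots + \theta^{-i} X_n - \theta^{-i} z_n \\
&= (Z_{n-i} - Z_{n-i-1}) + \theta^{-1}(Z_{n-i+1} - Z_{n-i}) + \cdots + \theta^{-i}(Z_n - Z_{n-1}) - \theta^{-i} z_n \\
&= -Z_{n-i-1} + (1 - \theta^{-1}) Z_{n-i} + \cdots + \theta^{-(i-1)}(1 - \theta^{-1}) Z_{n-1} + \theta^{-i}(Z_n - z_n) \\
&= -Z_{n-i-1} + y_{n-i-1},
\end{aligned}$$

where

$$\begin{aligned}
y_{n-i-1} &= (1 - \theta^{-1}) \sum_{j=1}^{i} (\theta^{-1})^{j-1} Z_{n-i-1+j} + \theta^{-i}(Z_n - z_n) \\
&= (1 - \theta^{-1}) \sum_{j=1}^{i} (\theta^{-1})^{j-1} Z_{n-i-1+j} + \theta^{-i}\Bigl[\Bigl(\sum_{t=1}^{n} X_t + Z_0\Bigr) - \Bigl(\sum_{t=1}^{n} X_t + z_{\mathrm{init}}\Bigr)\Bigr] \\
&= (1 - \theta^{-1}) \sum_{j=1}^{i} (\theta^{-1})^{j-1} Z_{n-i-1+j} + \theta^{-i}(Z_0 - z_{\mathrm{init}}),
\end{aligned}$$

for $i = 0, 1, \ldots, n-1$, and $y_n = Z_n - z_n = Z_0 - z_{\mathrm{init}}$.
Again, for $\theta > 1$, we have

$$\frac{1}{\sigma}\bigl[\ell_n(\theta, z_{\mathrm{init}}) - \ell_n(1, Z_0)\bigr] \approx \frac{1}{\sigma} \sum_{i=0}^{n} \bigl(|Z_i - y_i| - |Z_i|\bigr)$$

(the $i = n$ term differs only by the factor $|\theta| = 1 + \beta/n$, which is
asymptotically negligible), which has the same form as that for $\theta < 1$
but with different $\{y_i\}$. Following a similar derivation to that for
$\theta < 1$, one can show that

$$-\sum_{i=0}^{n} \frac{y_i}{\sigma} \operatorname{sign}(Z_i) \to -\beta \int_0^1 \int_{s+}^1 e^{-\beta(t-s)}\,dS(t)\,dW(s) + \alpha \int_0^1 e^{-\beta(1-s)}\,dW(s),$$

$$\sum_{i=0}^{n} \Bigl(\frac{y_i}{\sigma}\Bigr)^2 \to \int_0^1 \Bigl[-\beta \int_s^1 e^{-\beta(t-s)}\,dS(t) + \alpha e^{-\beta(1-s)}\Bigr]^2 ds,$$

in distribution as $n \to \infty$. Combining this with the analogous result to
(2.13) for $\beta > 0$ completes the proof. □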



We close this section with some elementary results concerning the relationship
between the limiting Brownian motions $S(t)$ and $W(t)$ that will be used in
the sequel. Since $\sigma = E|Z_t|$, the process $S(t)$ can be decomposed as

(2.14) $$S(t) = W(t) + cV(t),$$

where $\{W(t)\}$ and $\{V(t)\}$ are independent standard Brownian motions on
$[0, 1]$ and

$$c = \sqrt{\operatorname{Var}(Z_1)/\sigma^2 - 1}.$$

In addition, we have the following identities

$$\int_0^1 V(s)\,ds = V(1) - \int_0^1 s\,dV(s),$$

$$\int_0^1 V(s)\,dW(s) = V(1)W(1) - \int_0^1 W(s)\,dV(s),$$

$$\int_0^1 dW(s)\,dW(s) = \int_0^1 ds = 1,$$

$$\int_0^1 dV(s)\,dW(s) = 0,$$

where the first two equations can be obtained easily by integration by parts.
It follows that

(2.15) $$\int_0^1 dS(s)\,dW(s) = \int_0^1 dW(s)\,dW(s) + c \int_0^1 dV(s)\,dW(s) = 1.$$
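
The constant $c$ in (2.14) depends on the noise distribution only through the
ratio $\operatorname{Var}(Z_1)/(E|Z_1|)^2$. As a small worked example, the
sketch below evaluates it from standard closed-form moments for three of the
distributions used later in the simulations; the helper name `c_constant` is
hypothetical.

```python
import math

def c_constant(variance, mean_abs):
    """c = sqrt(Var(Z)/sigma^2 - 1) with the normalization sigma = E|Z|."""
    return math.sqrt(variance / mean_abs**2 - 1.0)

# Laplace with scale 1: Var = 2, E|Z| = 1.
c_laplace = c_constant(2.0, 1.0)
# Standard Gaussian: Var = 1, E|Z| = sqrt(2/pi).
c_gaussian = c_constant(1.0, math.sqrt(2.0 / math.pi))
# Uniform on (-1, 1): Var = 1/3, E|Z| = 1/2.
c_uniform = c_constant(1.0 / 3.0, 0.5)

print(c_laplace, c_gaussian, c_uniform)
```

For Laplace noise the constant equals 1, so that $S(t) = W(t) + V(t)$, which is
the case treated in Remark 3 of Section 3.3.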

3. Pile-up probabilities 

3.1. Joint likelihood 

In this section, we consider the local maximizer of the joint likelihood given
by $-\ell_n$ in (2.2). This estimator was also studied by Davis and Dunsmuir
[6] in the Gaussian case. Denote by
$(\hat\theta_n^{(J)}, \hat z_{\mathrm{init},n})$ the local minimizer of
$\ell_n(\theta, z_{\mathrm{init}})$ in which $\hat\theta_n^{(J)}$ is closest to
1. Using the $(\beta, \alpha)$ parameterization given in (2.3) and (2.4), this
is equivalent to finding the local minimizer
$(\hat\beta_n^{(J)}, \hat\alpha_n^{(J)})$ of $U_n(\beta, \alpha)$ in which
$\hat\beta_n^{(J)}$ is closest to zero. Moreover, the respective local
minimizers of $\ell_n$ and $U_n$ are connected through the following relations:

(3.1) $$\hat\theta_n^{(J)} = 1 + \frac{\hat\beta_n^{(J)}}{n}, \qquad \hat z_{\mathrm{init},n} = Z_0 + \frac{\sigma\,\hat\alpha_n^{(J)}}{\sqrt{n}}.$$

If the convergence of $U_n$ to $U$ in Theorem 2.1 is strengthened to weak
convergence of processes on $C(\mathbb{R}^2)$, then the argument given in Davis
and Dunsmuir [6] suggests the convergence in distribution of
$(\hat\beta_n^{(J)}, \hat\alpha_n^{(J)})$ to
$(\hat\beta^{(J)}, \hat\alpha^{(J)})$, where
$(\hat\beta^{(J)}, \hat\alpha^{(J)})$ is the local minimizer of
$U(\beta, \alpha)$ in which $\hat\beta^{(J)}$ is closest to 0. It follows that

(3.2) $$\bigl(n(\hat\theta_n^{(J)} - 1),\; \sqrt{n}\,(\hat z_{\mathrm{init},n} - Z_0)/\sigma\bigr) \xrightarrow{d} \bigl(\hat\beta^{(J)}, \hat\alpha^{(J)}\bigr).$$

The proofs of these results are the subject of ongoing research and will appear
in a forthcoming manuscript.

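Turning to the question of pile-up probabilities, we have that 1 is a local
minimizer if the derivative of the criterion function from the left is negative
and the derivative from the right is positive; that is,

$$P(\hat\theta_n^{(J)} = 1) = P(\hat\beta_n^{(J)} = 0) = P\Bigl[\lim_{\beta \uparrow 0} \frac{\partial}{\partial\beta} U_n(\beta, \hat\alpha_n(\beta)) < 0 \text{ and } \lim_{\beta \downarrow 0} \frac{\partial}{\partial\beta} U_n(\beta, \hat\alpha_n(\beta)) > 0\Bigr],$$

where $\hat\alpha_n(\beta) = \arg\min_\alpha U_n(\beta, \alpha)$ for given
$\beta$. Assuming convergence of the right- and left-hand derivatives of the
process $U_n(\beta, \hat\alpha_n(\beta))$, we obtain

(3.3) $$\lim_{n \to \infty} P(\hat\theta_n^{(J)} = 1) = P\Bigl[\lim_{\beta \uparrow 0} \frac{\partial}{\partial\beta} U(\beta, \hat\alpha(\beta)) < 0 \text{ and } \lim_{\beta \downarrow 0} \frac{\partial}{\partial\beta} U(\beta, \hat\alpha(\beta)) > 0\Bigr],$$

where $\hat\alpha(\beta) = \arg\min_\alpha U(\beta, \alpha)$. We now proceed to
simplify the limits of the two derivatives in the brackets of (3.3) in terms of
the processes $S(t)$ and $W(t)$. According to (2.6) in Theorem 2.1, and noting
that terms carrying a factor $\beta$ vanish in the limit, we have

$$\begin{aligned}
\lim_{\beta \uparrow 0} \frac{\partial}{\partial\alpha} U(\beta, \alpha)
&= \lim_{\beta \uparrow 0} \Bigl\{\int_0^1 e^{\beta s}\,dW(s) + f(0)\, 2\alpha \int_0^1 e^{2\beta s}\,ds\Bigr\} \\
&= \int_0^1 dW(s) + 2\alpha f(0) \int_0^1 ds \\
&= W(1) + 2\alpha f(0),
\end{aligned}$$

and therefore

$$\hat\alpha(0-) = -\frac{W(1)}{2 f(0)}.$$

The derivative of $U(\beta, \alpha)$ with respect to $\beta$ at zero from the
left-hand side satisfies

$$\begin{aligned}
\frac{\partial}{\partial\beta} U(\beta, \alpha)
&= \int_0^1 \int_0^s e^{\beta(s-t)}\,dS(t)\,dW(s) + \beta \int_0^1 \int_0^s e^{\beta(s-t)}(s-t)\,dS(t)\,dW(s) + \alpha \int_0^1 e^{\beta s} s\,dW(s) \\
&\quad + f(0) \Bigl\{ 2\beta \int_0^1 \Bigl(\int_0^s e^{\beta(s-t)}\,dS(t)\Bigr)^2 ds + 2\beta^2 \int_0^1 \Bigl(\int_0^s e^{\beta(s-t)}\,dS(t)\Bigr) \Bigl(\int_0^s e^{\beta(s-t)}(s-t)\,dS(t)\Bigr) ds \\
&\qquad + 2\alpha^2 \int_0^1 e^{2\beta s} s\,ds + 2\alpha \int_0^1 e^{\beta s} \Bigl(\int_0^s e^{\beta(s-t)}\,dS(t)\Bigr) ds + 2\alpha\beta \int_0^1 e^{\beta s} \Bigl(\int_0^s e^{\beta(s-t)}(2s-t)\,dS(t)\Bigr) ds \Bigr\}.
\end{aligned}$$

Taking the limit as $\beta \uparrow 0$, we have

$$\begin{aligned}
\lim_{\beta \uparrow 0} \frac{\partial}{\partial\beta} U(\beta, \hat\alpha(\beta))
&= \int_0^1 \int_0^s dS(t)\,dW(s) + \hat\alpha(0-) \int_0^1 s\,dW(s) + f(0) \Bigl\{ \hat\alpha^2(0-) \int_0^1 2s\,ds + 2\hat\alpha(0-) \int_0^1 \int_0^s dS(t)\,ds \Bigr\}
\end{aligned}$$

(3.4) $$= \int_0^1 S(s)\,dW(s) - W(1) \int_0^1 S(s)\,ds + \frac{W(1)}{2 f(0)} \Bigl[\int_0^1 W(s)\,ds - \frac{W(1)}{2}\Bigr] =: Y.$$

Similarly, according to (2.7) in Theorem 2.1, we have

$$\begin{aligned}
\lim_{\beta \downarrow 0} \frac{\partial}{\partial\alpha} U(\beta, \alpha)
&= \lim_{\beta \downarrow 0} \Bigl\{\int_0^1 e^{-\beta(1-s)}\,dW(s) + f(0)\, 2\alpha \int_0^1 e^{-2\beta(1-s)}\,ds\Bigr\} \\
&= \int_0^1 dW(s) + 2\alpha f(0) \int_0^1 ds \\
&= W(1) + 2\alpha f(0),
\end{aligned}$$

and therefore

$$\hat\alpha(0+) = -\frac{W(1)}{2 f(0)},$$

which is the same as $\hat\alpha(0-)$. The derivative of $U(\beta, \alpha)$
with respect to $\beta$ at zero from the right-hand side satisfies

$$\begin{aligned}
\frac{\partial}{\partial\beta} U(\beta, \alpha)
&= -\int_0^1 \int_{s+}^1 e^{-\beta(t-s)}\,dS(t)\,dW(s) - \beta \int_0^1 \int_{s+}^1 e^{-\beta(t-s)}(s-t)\,dS(t)\,dW(s) + \alpha \int_0^1 e^{-\beta(1-s)}(s-1)\,dW(s) \\
&\quad + f(0) \Bigl\{ 2\beta \int_0^1 \Bigl(\int_s^1 e^{-\beta(t-s)}\,dS(t)\Bigr)^2 ds + 2\beta^2 \int_0^1 \Bigl(\int_s^1 e^{-\beta(t-s)}\,dS(t)\Bigr) \Bigl(\int_s^1 e^{-\beta(t-s)}(s-t)\,dS(t)\Bigr) ds \\
&\qquad + \alpha^2 \int_0^1 e^{-2\beta(1-s)}\, 2(s-1)\,ds - 2\alpha \int_0^1 e^{-\beta(1-s)} \Bigl(\int_s^1 e^{-\beta(t-s)}\,dS(t)\Bigr) ds \\
&\qquad - 2\alpha\beta \int_0^1 \int_s^1 e^{-\beta(1+t-2s)}(2s-t-1)\,dS(t)\,ds \Bigr\}.
\end{aligned}$$

Taking the limit $\beta \downarrow 0$ and using the remark in Section 2, we
have

$$\begin{aligned}
\lim_{\beta \downarrow 0} \frac{\partial}{\partial\beta} U(\beta, \hat\alpha(\beta))
&= -\int_0^1 \int_{s+}^1 dS(t)\,dW(s) + \hat\alpha(0+) \int_0^1 (s-1)\,dW(s) + f(0) \Bigl\{ \hat\alpha^2(0+) \int_0^1 2(s-1)\,ds - 2\hat\alpha(0+) \int_0^1 \int_s^1 dS(t)\,ds \Bigr\} \\
&= -S(1)W(1) + \int_0^1 S(s)\,dW(s) + 1 + \hat\alpha(0+) \Bigl\{ \bigl[(s-1)W(s)\bigr]_0^1 - \int_0^1 W(s)\,ds \Bigr\} \\
&\quad + f(0) \Bigl\{ -\hat\alpha^2(0+) - 2\hat\alpha(0+) \Bigl[S(1) - \int_0^1 S(s)\,ds\Bigr] \Bigr\} \\
&= \int_0^1 S(s)\,dW(s) - W(1) \int_0^1 S(s)\,ds + \frac{W(1)}{2 f(0)} \Bigl[\int_0^1 W(s)\,ds - \frac{W(1)}{2}\Bigr] + 1 \\
&= Y + 1.
\end{aligned}$$

Therefore, the pile-up probability in (3.3) can be expressed in terms of $Y$ as

$$\lim_{n \to \infty} P(\hat\theta_n^{(J)} = 1) = P[\,Y < 0 \text{ and } Y + 1 > 0\,] = P[-1 < Y < 0].$$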

3.2. Exact likelihood estimation 

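In this section, we consider pile-up probabilities associated with the
estimator that maximizes the exact Laplace likelihood. For $\theta \le 1$, the
joint density of $(\mathbf{x}_n, z_{\mathrm{init}})$ satisfies

$$f(\mathbf{x}_n, z_{\mathrm{init}}) = \prod_{t=0}^{n} f(z_t) = \Bigl(\frac{1}{2\sigma}\Bigr)^{n+1} \exp\Bigl\{-\frac{1}{\sigma} \sum_{t=0}^{n} |z_t|\Bigr\} = \Bigl(\frac{1}{2\sigma}\Bigr)^{n+1} \exp\Bigl\{-\frac{\ell_n(\theta, z_{\mathrm{init}})}{\sigma}\Bigr\}.$$

Integrating out the augmented variable $z_{\mathrm{init}}$, we obtain

$$\int_{-\infty}^{\infty} f(\mathbf{x}_n, z_{\mathrm{init}})\,dz_{\mathrm{init}} = \Bigl(\frac{1}{2\sigma}\Bigr)^{n+1} \exp\Bigl(-\frac{\ell_n(1, Z_0)}{\sigma}\Bigr) \frac{\sigma}{\sqrt{n}} \int_{-\infty}^{\infty} e^{-U_n(\beta, \alpha)}\,d\alpha,$$

since under the parameterization (2.4),
$dz_{\mathrm{init}} = (\sigma/\sqrt{n})\,d\alpha$. The Laplace log-likelihood of
$(\theta, \sigma)$ given $\mathbf{x}_n$ then satisfies

$$\begin{aligned}
\ell_n^*(\theta, \sigma) &= \log \int_{-\infty}^{\infty} f(\mathbf{x}_n, z_{\mathrm{init}})\,dz_{\mathrm{init}} \\
&= -(n+1)\log(2\sigma) - \frac{\ell_n(1, Z_0)}{\sigma} + \log\Bigl(\frac{\sigma}{\sqrt{n}}\Bigr) + \log \int_{-\infty}^{\infty} e^{-U_n(\beta, \alpha)}\,d\alpha,
\end{aligned}$$

where the last term does not depend on $\sigma$ as $n \to \infty$. So
maximizing $\ell_n^*$ with respect to $\theta \le 1$ is approximately the same
as maximizing

(3.5) $$U_n^*(\beta) \equiv \log \int_{-\infty}^{\infty} e^{-U_n(\beta, \alpha)}\,d\alpha$$

with respect to $\beta \le 0$.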

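Similarly, for $\theta > 1$, the Laplace log-likelihood of $(\theta, \sigma)$
is

$$\begin{aligned}
\ell_n^*(\theta, \sigma) &= \log \int_{-\infty}^{\infty} f(\mathbf{x}_n, z_{\mathrm{init}})\,dz_{\mathrm{init}} \\
&= -n \log|\theta| - (n+1)\log(2\sigma) - \frac{\ell_n(1, Z_0)}{\sigma|\theta|} + \log\Bigl(\frac{\sigma}{\sqrt{n}}\Bigr) + \log \int_{-\infty}^{\infty} e^{-U_n(\beta, \alpha)/|\theta|}\,d\alpha,
\end{aligned}$$

where again the last term does not depend on $\sigma$ as $n \to \infty$. As
above, maximizing $\ell_n^*$ with respect to $\theta > 1$ is equivalent to
maximizing

(3.6) $$U_n^*(\beta) \equiv \log \int_{-\infty}^{\infty} e^{-U_n(\beta, \alpha)\, n/(n+\beta)}\,d\alpha$$

for $\beta > 0$.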

A heuristic argument based on the process convergence of $U_n$ to $U$ suggests
that

(3.7) $$U_n^*(\beta) \xrightarrow{d} U^*(\beta) \equiv \log \int_{-\infty}^{\infty} e^{-U(\beta, \alpha)}\,d\alpha,$$

where $U$ is specified by (2.6) for $\beta \le 0$ and by (2.7) for
$\beta > 0$. Now if $\hat\beta_n^{(E)}$ denotes the local maximum of the exact
likelihood, or alternatively the maximizer of $U_n^*(\beta)$ that is closest to
0, then the convergence in (3.7) suggests convergence in distribution for the
local maximizer of the exact likelihood, i.e.,

(3.8) $$n(\hat\theta_n^{(E)} - 1) = \hat\beta_n^{(E)} \xrightarrow{d} \hat\beta^{(E)},$$

where $\hat\beta^{(E)}$ is the local maximizer of $U^*(\beta)$ that is closest
to 0.
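The limiting pile-up probabilities for $\hat\theta_n^{(E)}$ are calculated from

$$\lim_{n \to \infty} P(\hat\theta_n^{(E)} = 1) = \lim_{n \to \infty} P(\hat\beta_n^{(E)} = 0) = P(\hat\beta^{(E)} = 0) = P\Bigl[\lim_{\beta \uparrow 0} \frac{\partial}{\partial\beta} U^*(\beta) > 0 \text{ and } \lim_{\beta \downarrow 0} \frac{\partial}{\partial\beta} U^*(\beta) < 0\Bigr].$$

Fortunately, the right- and left-hand derivatives of $U^*$ can be computed
explicitly. These are found to be

$$\lim_{\beta \uparrow 0} \frac{\partial}{\partial\beta} U^*(\beta) = -\Bigl\{ -\frac{W^2(1)}{4 f(0)} + \frac{W(1)}{2 f(0)} \int_0^1 W(s)\,ds - W(1) \int_0^1 S(s)\,ds + \int_0^1 S(s)\,dW(s) \Bigr\} - \frac{1}{2} = -Y - \frac{1}{2},$$

$$\lim_{\beta \downarrow 0} \frac{\partial}{\partial\beta} U^*(\beta) = -\Bigl\{ -\frac{W^2(1)}{4 f(0)} + \frac{W(1)}{2 f(0)} \int_0^1 W(s)\,ds - W(1) \int_0^1 S(s)\,ds + \int_0^1 S(s)\,dW(s) \Bigr\} - \frac{1}{2} = -Y - \frac{1}{2},$$

where $Y$ is defined in (3.4). Since the two one-sided derivatives coincide,
the limiting pile-up probability for $\hat\theta_n^{(E)}$ is then

$$\lim_{n \to \infty} P(\hat\theta_n^{(E)} = 1) = P\Bigl[-Y - \frac{1}{2} > 0 \text{ and } -Y - \frac{1}{2} < 0\Bigr] = 0.$$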



3.3. Remarks 



Here we collect several remarks concerning the results of Sections 3.1 and 3.2. 

Remark 1. Under the assumptions of Theorem 2.1, the asymptotic pile-up
probability for the estimator $\hat\theta_n^{(J)}$ based on the joint
likelihood is always positive. On the other hand, the asymptotic pile-up
probability for the estimator $\hat\theta_n^{(E)}$ based on the exact
likelihood is zero.

Remark 2. The two estimators of $\theta_0$ considered in Sections 3.1 and 3.2
were defined as the local optimizers of objective functions that were closest
to 1. One could also consider the global optimizers of these objective
functions. For example, the exact MLE in the Gaussian case was considered in
Davis and Dunsmuir [6] and Davis, Chen and Dunsmuir, and has a different
limiting distribution than the local MLE. In our case, there will be a positive
asymptotic pile-up probability for the global maximum of the joint likelihood
and a zero asymptotic pile-up probability for the global maximum of the exact
likelihood.

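Remark 3. Suppose $Z_t$ has a Laplace distribution with the density function

$$f_Z(z) = \frac{1}{2\sigma}\, e^{-|z|/\sigma}.$$

Then $Y$ defined in (3.4) satisfies

(3.9) $$Y = \int_0^1 \bigl[W(1)s - W(s)\bigr]\,dV(s) - \frac{1}{2},$$

where $W(s)$ and $V(s)$ are independent standard Brownian motions. To prove
(3.9), note that the constant $c$ in (2.14) is equal to 1 so that

$$S(t) = W(t) + V(t).$$

In the following calculations, we use the well-known Itô formula

$$\int_0^1 W(s)\,dW(s) = \frac{W^2(1)}{2} - \frac{1}{2}.$$

Since $f(0) = 1/2$, the random variable $Y$ defined in (3.4) can be further
simplified in terms of $W(t)$ and $V(t)$ as

$$\begin{aligned}
Y &= \int_0^1 S(s)\,dW(s) - W(1) \int_0^1 S(s)\,ds + \frac{W(1)}{2 f(0)} \Bigl[\int_0^1 W(s)\,ds - \frac{W(1)}{2}\Bigr] \\
&= \int_0^1 V(s)\,dW(s) + \int_0^1 W(s)\,dW(s) - W(1) \int_0^1 V(s)\,ds - W(1) \int_0^1 W(s)\,ds + W(1) \int_0^1 W(s)\,ds - \frac{W^2(1)}{2} \\
&= V(1)W(1) - \int_0^1 W(s)\,dV(s) - \frac{1}{2} - W(1) \Bigl[V(1) - \int_0^1 s\,dV(s)\Bigr] \\
&= \int_0^1 \bigl[W(1)s - W(s)\bigr]\,dV(s) - \frac{1}{2}.
\end{aligned}$$

Therefore, the pile-up probability for Laplace innovations is

$$\begin{aligned}
P(-1 < Y < 0) &= P\Bigl[-\frac{1}{2} < \int_0^1 \bigl[W(1)s - W(s)\bigr]\,dV(s) < \frac{1}{2}\Bigr] \\
&= E\Bigl\{ P\Bigl[-\frac{1}{2} < \int_0^1 \bigl[W(1)s - W(s)\bigr]\,dV(s) < \frac{1}{2} \Bigm| W(t) \text{ on } t \in [0,1]\Bigr] \Bigr\} \\
&= E\Bigl\{ P\Bigl[-\frac{1}{2}\Bigl(\int_0^1 \bigl[W(1)s - W(s)\bigr]^2 ds\Bigr)^{-1/2} < U < \frac{1}{2}\Bigl(\int_0^1 \bigl[W(1)s - W(s)\bigr]^2 ds\Bigr)^{-1/2} \Bigm| W(t) \text{ on } t \in [0,1]\Bigr] \Bigr\} \\
&= E\Bigl\{ \Phi\Bigl(\frac{1}{2}\Bigl(\int_0^1 \bigl[W(1)s - W(s)\bigr]^2 ds\Bigr)^{-1/2}\Bigr) - \Phi\Bigl(-\frac{1}{2}\Bigl(\int_0^1 \bigl[W(1)s - W(s)\bigr]^2 ds\Bigr)^{-1/2}\Bigr) \Bigr\} \\
&\approx 0.820,
\end{aligned}$$

where $U$ has the standard normal distribution and $\Phi(\cdot)$ is the
corresponding cumulative distribution function. This pile-up probability, which
was computed via simulation based on 100000 replications of $W(t)$ on $[0, 1]$,
has a standard error of 0.0010.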
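
The final expectation in Remark 3 lends itself to a direct Monte Carlo
evaluation. The sketch below is an illustration rather than the authors' code:
the function name, the grid size `m`, the number of replications, and the seed
are assumptions chosen to keep the run small, and the time integral is
approximated by a Riemann sum.

```python
import numpy as np
from math import erf, sqrt

def laplace_pileup_probability(reps=4000, m=1000, seed=0):
    """Monte Carlo estimate of E[ 2*Phi( 0.5 / sqrt(Q) ) - 1 ],
    where Q = int_0^1 [W(1)s - W(s)]^2 ds and W is a standard Wiener process."""
    rng = np.random.default_rng(seed)
    s = np.arange(1, m + 1) / m
    # Approximate W on a grid of m points by scaled partial sums of N(0,1) draws.
    W = np.cumsum(rng.standard_normal((reps, m)), axis=1) / np.sqrt(m)
    bridge = W[:, -1:] * s - W                  # W(1)s - W(s)
    Q = np.mean(bridge**2, axis=1)              # Riemann sum for the integral
    a = 0.5 / np.sqrt(Q)
    # For standard normal U: P(|U| < a) = Phi(a) - Phi(-a) = erf(a / sqrt(2)).
    probs = np.array([erf(v / sqrt(2.0)) for v in a])
    return probs.mean(), probs.std(ddof=1) / np.sqrt(reps)

estimate, std_err = laplace_pileup_probability()
print(estimate, std_err)   # the text reports a limiting value of about 0.820
```

Conditioning on $W$ and integrating out $V$ analytically, as in the remark,
avoids simulating the stochastic integral itself and reduces the Monte Carlo
variance.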

Remark 4. From the limiting result (3.2), it follows that the random variable
$Z_0$ can be estimated consistently. It may seem odd to have a consistent
estimate of a noise term in a moving average process. On the other hand, an
MA(1) process with a unit root is both invertible and non-invertible. That is,
$Z_0$ is an element of the two Hilbert spaces generated by the linear span of
$\{X_t, t \le 0\}$ and $\{X_t, t \ge 1\}$, respectively. It is the latter
Hilbert space which allows for consistent estimation of $Z_0$.



4. Numerical simulation 



In this section, we compute the asymptotic pile-up probabilities associated
with the estimator $\hat\theta_n^{(J)}$ which maximizes the joint Laplace
likelihood for several different noise distributions. The empirical properties
of the estimators $\hat\theta_n^{(J)}$ (the local maximizer of the joint
Laplace likelihood) and $\hat\theta_n^{(E)}$ (the local maximizer of the exact
Laplace likelihood) for finite samples are compared with each other and with
the corresponding asymptotic theory.

For approximating the asymptotic pile-up probabilities and the limiting
distribution of $\hat\beta^{(J)}$, we first simulate 100000 replications of
independent standard Wiener processes $W(t)$ and $V(t)$ on $[0, 1]$, in which
$W(t)$ and $V(t)$ are approximated by the partial sums
$W(t) = \sum_{j=1}^{[10000t]} W_j/\sqrt{10000}$ and
$V(t) = \sum_{j=1}^{[10000t]} V_j/\sqrt{10000}$, where $\{W_j\}$ and $\{V_j\}$
are independent standard normal random variables. From the simulation of $W(t)$
and $V(t)$, the distribution of the limit random variable $\hat\beta^{(J)}$ can
be tabulated and the pile-up probability $P(-1 < Y < 0)$ estimated, where $Y$
is given in (3.4). The empirical pile-up probabilities and their asymptotic
limits are displayed in Table 1 for different noise distributions: Laplace,
Gaussian, uniform, and $t$ with 5 degrees of freedom. Notice that there is good
agreement between the asymptotic and empirical probabilities for sample sizes
as small as 50.

For examining the empirical performance of the local maximizers
$\hat\theta_n^{(J)}$ and $\hat\theta_n^{(E)}$, we only consider the process
generated with Laplace noise with $\sigma = 1$ and sample sizes
$n = 20, 50, 100, 200$. For each setup, 1000 realizations of the MA(1) process
with $\theta_0 = 1$ are generated and the estimates $\hat\theta_n^{(J)}$ and
$\hat\theta_n^{(E)}$ and their corresponding estimates of the scale parameter
are obtained. The estimation results are summarized in Table 2. For comparison,
the standard deviations based on the limit distributions of
$\hat\theta_n^{(J)}$ and $\hat\theta_n^{(E)}$ are also reported (denoted by
asymp in the table), which are obtained numerically based on 100000 replicates
of the limit process $U$. Generally speaking, the empirical root mean square
errors are very close to their asymptotic values even for very small samples.
Moreover, the estimation error of $\hat\theta_n^{(J)}$ is about 1/2 the
estimation error of $\hat\theta_n^{(E)}$, which indicates the superiority of
using the joint likelihood over the exact likelihood when $\theta_0 = 1$.

We also considered performance of the two estimators $\hat\theta_n^{(J)}$ and
$\hat\theta_n^{(E)}$ in the case when $\theta_0 \ne 1$. A limit theory for
these estimators can be derived in this case by assuming that the true value
$\theta_0$ is near 1. That is, we can parameterize the MA(1) parameter by
$\theta_0 = 1 + \gamma/n$ (e.g., Davis and Dunsmuir [6]). While we have not
pursued the theory in the near unit root case, the relative performance of
these estimators was compared in a limited simulation study. We considered 3
values of $\theta_0 = 0.8, 0.9, 0.95$ and their reciprocals $1/0.8$, $1/0.9$,
$1/0.95$. The latter 3 cases correspond to purely non-invertible models. The
results reported in Table 3 are based on the global optimization of the joint
and exact likelihoods. The first two columns contain the number of realizations
out of 1000 in which the estimator was invertible ($< 1$) and on the unit
circle ($= 1$), respectively. For example, in the $\theta_0 = 0.8$ and
$\hat\theta_n^{(J)}$ case, 78.9% of the realizations produced invertible
models, and the empirical pile-up probability is 0.095. On the other hand, for
$\theta_0 = 1/0.8$, 79.5% of the realizations produced a purely non-invertible
model with an empirical pile-up probability of 0.109. Both objective functions
do a reasonably good job of discriminating between invertible and
non-invertible models, with a performance edge going to the exact likelihood.
In terms of root mean square error, the performance of $\hat\theta_n^{(E)}$ is
superior to $\hat\theta_n^{(J)}$ as $\theta_0$ moves away from the unit circle.

Table 1

Empirical pile-up probabilities of the local maximizer $\hat\theta_n^{(J)}$ of
the joint Laplace likelihood for an MA(1) with $\theta_0 = 1$ and sample sizes
n = 20, 50, 100, 200 (based on 1000 replicates) and their asymptotic values
under various noise distributions.

    n       Gau      Lap      Unif     t(5)
    20      0.827    0.796    0.831    0.796
    50      0.859    0.806    0.864    0.823
    100     0.873    0.819    0.864    0.817
    200     0.844    0.819    0.843    0.831
    500     0.855    0.809    0.841    0.846
    ∞       0.873    0.820    0.862    0.836

Table 2

Bias, standard deviation and root mean square error of the local maximizers
$\hat\theta_n^{(J)}$ and $\hat\theta_n^{(E)}$ of the joint and exact Laplace
likelihoods, respectively, for an MA(1) process generated by Laplace noise with
$\theta_0 = 1$ and $\sigma = 1$ (1000 replications).

    n                  theta^(J)    theta^(E)
    n = 20    bias      -0.003       -0.006
              s.d.       0.066        0.144
              rmse       0.066        0.144
              asymp      0.053        0.121
    n = 50    bias      -0.000        0.000
              s.d.       0.021        0.057
              rmse       0.021        0.057
              asymp      0.021        0.048
    n = 100   bias      -0.000        0.001
              s.d.       0.011        0.030
              rmse       0.011        0.030
              asymp      0.011        0.024
    n = 200   bias       0.000        0.001
              s.d.       0.006        0.014
              rmse       0.006        0.014
              asymp      0.005        0.012

Table 3

Bias, standard deviation and root mean square error of the global maximizers
$\hat\theta_n^{(J)}$ and $\hat\theta_n^{(E)}$ of the joint and exact Laplace
likelihoods, respectively, for an MA(1) process generated by Laplace noise with
$\theta_0 = 0.8, 0.9, 0.95, 1/0.95, 1/0.9, 1/0.8$, $\sigma = 1$, and $n = 50$
based on 1000 replications. The first 2 columns record the number of times (out
of 1000) that the estimates were less than 1 (invertible) and equal to 1 (unit
root).

    theta_0   estimator        < 1    = 1     bias      s.d.      rmse
    0.8       theta_50^(J)     789     95    0.0734    0.1973    0.2105
              theta_50^(E)     873     19    0.0498    0.1753    0.1822
    0.9       theta_50^(J)     557    322    0.0578    0.1398    0.1513
              theta_50^(E)     767     93    0.0327    0.0933    0.0989
    0.95      theta_50^(J)     404    503    0.0322    0.0708    0.0778
              theta_50^(E)     632    168    0.0235    0.0821    0.0854
    1/0.95    theta_50^(J)      90    540   -0.0315    0.0763    0.0825
              theta_50^(E)     286    114   -0.0207    0.0890    0.0914
    1/0.9     theta_50^(J)      89    299   -0.0389    0.1227    0.1287
              theta_50^(E)     207     71   -0.0327    0.1218    0.1261
    1/0.8     theta_50^(J)      96    109   -0.0338    0.2645    0.2666
              theta_50^(E)     149     19   -0.0492    0.2280    0.2333

Remark. The LAD estimate of $\theta_0$ is obtained by minimizing the objective
function given in (2.2) with $z_{\mathrm{init}} = 0$. Although we have not
considered the asymptotic pile-up in this case, the estimator does not perform
as well as $\hat\theta_n^{(J)}$ and $\hat\theta_n^{(E)}$. For example, in
simulation results not reported here, the rmse of the LAD estimator tended to
be twice as large as the rmse for the exact MLE.